Overview

Dataset statistics

Number of variables 15
Number of observations 144458
Missing cells 1433459
Missing cells (%) 66.2%
Duplicate rows 0
Duplicate rows (%) 0.0%
Total size in memory 16.5 MiB
Average record size in memory 120.0 B
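
The missing-cell figures above can be cross-checked by hand. The sketch below takes the per-variable missing counts from the Alerts section of this report, sums them, and divides by the number of cells (variables × observations). The variable names in the code are illustrative only.

```python
# Per-variable missing counts as listed in the Alerts section of this report.
missing_per_variable = {
    "timestamp": 0,
    "university": 96200,
    "experimentid": 96200,
    "userid": 96200,
    "day": 96200,
    "accuracy": 96200,
    "label": 96200,
    "still": 96904,
    "on_foot": 105707,
    "walking": 105989,
    "running": 113078,
    "in_vehicle": 104169,
    "on_bicycle": 107690,
    "tilting": 120919,
    "unknown": 101803,
}

n_variables = 15
n_observations = 144458

# Total missing cells and their share of all cells in the table.
missing_cells = sum(missing_per_variable.values())
total_cells = n_variables * n_observations
missing_pct = round(100 * missing_cells / total_cells, 1)

print(missing_cells, total_cells, missing_pct)
```

The sum of the per-variable counts reproduces the "Missing cells" total, and the ratio reproduces the reported percentage.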

Variable types

DateTime 1
Categorical 4
Numeric 10

Dataset

Description Sensor that returns a label identifying the activity performed by the user, accurately detected using low-power signals from multiple sensors in the device. This is achieved using Google’s Activity Recognition APIs. Possible activities are: still, in_vehicle, on_bicycle, on_foot, running, tilting, walking. To allow comparison across sensor observations, the frequency was reduced to one minute. For each categorical variable, the first non-missing name is reported.
Creator Matteo Busso, Massimo Stefan
Author Fausto Giunchiglia, Ivano Bison, Matteo Busso, Ronald Chenu-Abente, Marcelo Rodas Britez, Can Gunel, Giuseppe Veltri, Amalia de Götzen, Peter Kun, Amarsanaa Ganbold, Altangerel Chagnaa, George Gaskell, Miriam Bidoglia, Luca Cernuzzi, Alethia Hume, Jose Luis Zarza, Daniele Miorandi, Carlo Caprini
URL
Copyright (c) KnowDive 2022
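
The description says the frequency was reduced to one minute and that the first non-missing name is reported for categorical variables. A minimal sketch of that idea follows, assuming records arrive as time-sorted (timestamp, label) pairs. The function name, the record layout, and the sample values are hypothetical, and the report does not state how numeric columns were aggregated, so they are left out.

```python
from datetime import datetime


def downsample_to_minute(records):
    """Keep the first non-missing label seen in each one-minute bucket.

    `records` is an iterable of (timestamp, label) pairs sorted by time;
    a label of None stands for a missing value.
    """
    per_minute = {}
    for ts, label in records:
        bucket = ts.replace(second=0, microsecond=0)  # truncate to the minute
        # Register the bucket, but only fill it while it is still missing.
        if per_minute.get(bucket) is None:
            per_minute[bucket] = label
    return per_minute


# Invented sample readings, not taken from the dataset.
records = [
    (datetime(1900, 3, 15, 17, 34, 2), None),
    (datetime(1900, 3, 15, 17, 34, 31), "Still"),
    (datetime(1900, 3, 15, 17, 34, 55), "Tilting"),
    (datetime(1900, 3, 15, 17, 35, 10), None),
]
result = downsample_to_minute(records)
```

A minute with no non-missing reading stays missing, which would be consistent with the large share of missing values reported for most variables.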

Variable descriptions

university University where the experiment took place
experimentid Experiment Id
userid User id
day Day, encoded as month (2 digits) and day (2 digits)
timestamp Timestamp, encoded as month (2 digits), day (2 digits), hour (2 digits), minute (2 digits), second (2 digits), decimals (3 digits)
accuracy The highest accuracy for possible activities
label The activity name with highest accuracy
still The value of the "still" activity
on_foot The value of the "on_foot" activity
walking The value of the "walking" activity
running The value of the "running" activity
in_vehicle The value of the "in_vehicle" activity
on_bicycle The value of the "on_bicycle" activity
tilting The value of the "tilting" activity
unknown The value of the "unknown" activity
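
According to the descriptions above, accuracy holds the highest value among the possible activities and label names the activity with that value. The sketch below illustrates that relationship for a single row. The function name and the sample row are hypothetical, and the tie-breaking rule (first activity in dictionary order) is an assumption, since the report does not say how ties are resolved.

```python
def top_activity(values):
    """Return (label, accuracy) for the activity with the highest value.

    `values` maps activity names to their reported values; None marks a
    missing value and is ignored.
    """
    present = {name: v for name, v in values.items() if v is not None}
    if not present:
        return None, None
    # max() returns the first key with the largest value (assumed tie rule).
    label = max(present, key=present.get)
    return label, present[label]


# Invented sample row, not taken from the dataset.
row = {"still": 97, "on_foot": 1, "walking": 1, "in_vehicle": 2, "unknown": None}
label, accuracy = top_activity(row)
```

Note that the label column in this report uses names such as "Still" and "OnFoot" rather than the column names, so a real pipeline would also need a name mapping that is not shown here.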

Alerts

tilting has constant value "100.0" Constant
accuracy is highly correlated with still High correlation
still is highly correlated with accuracy High correlation
on_foot is highly correlated with walking High correlation
walking is highly correlated with on_foot High correlation
experimentid is highly correlated with university and 1 other fields High correlation
university is highly correlated with experimentid and 1 other fields High correlation
label is highly correlated with tilting High correlation
tilting is highly correlated with experimentid and 2 other fields High correlation
university is highly correlated with experimentid and 2 other fields High correlation
experimentid is highly correlated with university and 1 other fields High correlation
userid is highly correlated with university High correlation
day is highly correlated with university and 1 other fields High correlation
accuracy is highly correlated with label and 5 other fields High correlation
label is highly correlated with accuracy and 6 other fields High correlation
still is highly correlated with accuracy and 6 other fields High correlation
on_foot is highly correlated with accuracy and 5 other fields High correlation
walking is highly correlated with accuracy and 6 other fields High correlation
running is highly correlated with on_foot and 1 other fields High correlation
in_vehicle is highly correlated with accuracy and 2 other fields High correlation
on_bicycle is highly correlated with label and 4 other fields High correlation
unknown is highly correlated with accuracy and 4 other fields High correlation
university has 96200 (66.6%) missing values Missing
experimentid has 96200 (66.6%) missing values Missing
userid has 96200 (66.6%) missing values Missing
day has 96200 (66.6%) missing values Missing
accuracy has 96200 (66.6%) missing values Missing
label has 96200 (66.6%) missing values Missing
still has 96904 (67.1%) missing values Missing
on_foot has 105707 (73.2%) missing values Missing
walking has 105989 (73.4%) missing values Missing
running has 113078 (78.3%) missing values Missing
in_vehicle has 104169 (72.1%) missing values Missing
on_bicycle has 107690 (74.5%) missing values Missing
tilting has 120919 (83.7%) missing values Missing
unknown has 101803 (70.5%) missing values Missing
timestamp has unique values Unique
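
The high-correlation alerts above flag pairs of variables that move together. A minimal sketch of how such a flag could be computed is given below, using Pearson correlation over the rows where both values are present. The 0.9 threshold, the function names, and the sample columns are assumptions for illustration, and the report itself may rely on other correlation measures.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation over pairs where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        return None
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        return None  # a constant column has no defined correlation
    return cov / sqrt(var_x * var_y)


def high_correlation_pairs(columns, threshold=0.9):
    """List column pairs whose absolute correlation reaches the threshold."""
    flagged = []
    for a, b in combinations(columns, 2):
        r = pearson(columns[a], columns[b])
        if r is not None and abs(r) >= threshold:
            flagged.append((a, b))
    return flagged


# Invented sample columns, not taken from the dataset.
columns = {
    "on_foot": [10, 96, 1, None, 14],
    "walking": [10, 96, 1, 8, 13],
    "unknown": [40, 1, 2, 2, 40],
}
flagged = high_correlation_pairs(columns)
```

With these sample values only the on_foot and walking pair is flagged, mirroring the kind of alert listed above.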

Reproduction

Analysis started 2022-07-04 17:05:38.530108
Analysis finished 2022-07-04 17:06:09.804737
Duration 31.27 seconds
Software version pandas-profiling v3.2.0
Download configuration config.json
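
A report like this one is typically produced by passing a DataFrame to pandas-profiling's `ProfileReport`. The sketch below shows one way the dataset metadata and variable descriptions seen above could be supplied. The title, the output path, and the helper names are hypothetical, only a few descriptions are included, and the actual settings used for this report are in the config.json referenced above, which is not reproduced here.

```python
def build_report_kwargs():
    """Collect metadata shown in this report as ProfileReport keyword arguments."""
    return {
        "title": "Activity recognition sensor",  # hypothetical title
        "dataset": {
            "description": (
                "Sensor that returns a label identifying the activity "
                "performed by the user."
            ),
            "creator": "Matteo Busso, Massimo Stefan",
            "copyright_holder": "KnowDive",
            "copyright_year": "2022",
        },
        "variables": {
            "descriptions": {
                "university": "University where the experiment took place",
                "experimentid": "Experiment Id",
                "userid": "User id",
            }
        },
    }


def make_report(df, output_path="report.html"):
    """Profile `df` and write the HTML report (requires pandas-profiling)."""
    # Imported lazily so the helper above works without the package installed.
    from pandas_profiling import ProfileReport

    report = ProfileReport(df, **build_report_kwargs())
    report.to_file(output_path)
    return report
```

Keeping the metadata in a plain dictionary makes it easy to review or version separately from the profiling call.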

Variables

timestamp
Date

UNIQUE

Timestamp, encoded as month (2 digits), day (2 digits), hour (2 digits), minute (2 digits), second (2 digits), decimals (3 digits)

Distinct 144458
Distinct (%) 100.0%
Missing 0
Missing (%) 0.0%
Memory size 1.1 MiB
Minimum 1900-03-15 17:34:00
Maximum 1900-06-24 01:11:00
Histogram with fixed size bins (bins=50)

university
Categorical

HIGH CORRELATION
MISSING

University where the experiment took place

Distinct 5
Distinct (%) < 0.1%
Missing 96200
Missing (%) 66.6%
Memory size 1.1 MiB
unitn 21888
num 16982
lse 4848
uc 3495
aau 1045

Length

Max length 5
Median length 3
Mean length 3.834700982
Min length 2

Characters and Unicode

Total characters 185055
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row aau
2nd row aau
3rd row aau
4th row aau
5th row aau

Common Values

Value Count Frequency (%)
unitn 21888 15.2%
num 16982 11.8%
lse 4848 3.4%
uc 3495 2.4%
aau 1045 0.7%
(Missing) 96200 66.6%

Length

Histogram of lengths of the category

Category Frequency Plot

Value Count Frequency (%)
unitn 21888 45.4%
num 16982 35.2%
lse 4848 10.0%
uc 3495 7.2%
aau 1045 2.2%
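
The two frequency tables for university differ only in their denominator: the Common Values table divides by all observations, missing included, while the Category Frequency Plot table divides by the non-missing observations. The sketch below reproduces both sets of percentages from the counts in this report. The variable names are illustrative.

```python
# Counts per university as listed in this report.
counts = {"unitn": 21888, "num": 16982, "lse": 4848, "uc": 3495, "aau": 1045}
n_observations = 144458

n_present = sum(counts.values())
n_missing = n_observations - n_present

# Share of all rows (missing values included in the denominator).
share_of_all_rows = {k: round(100 * v / n_observations, 1) for k, v in counts.items()}
# Share of rows that actually carry a value.
share_of_present = {k: round(100 * v / n_present, 1) for k, v in counts.items()}

print(n_missing, share_of_all_rows, share_of_present)
```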

Most occurring characters

Value Count Frequency (%)
n 60758 32.8%
u 43410 23.5%
i 21888 11.8%
t 21888 11.8%
m 16982 9.2%
l 4848 2.6%
s 4848 2.6%
e 4848 2.6%
c 3495 1.9%
a 2090 1.1%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 185055 100.0%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
n 60758 32.8%
u 43410 23.5%
i 21888 11.8%
t 21888 11.8%
m 16982 9.2%
l 4848 2.6%
s 4848 2.6%
e 4848 2.6%
c 3495 1.9%
a 2090 1.1%

Most occurring scripts

Value Count Frequency (%)
Latin 185055 100.0%

Most frequent character per script

Latin
Value Count Frequency (%)
n 60758 32.8%
u 43410 23.5%
i 21888 11.8%
t 21888 11.8%
m 16982 9.2%
l 4848 2.6%
s 4848 2.6%
e 4848 2.6%
c 3495 1.9%
a 2090 1.1%

Most occurring blocks

Value Count Frequency (%)
ASCII 185055 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
n 60758 32.8%
u 43410 23.5%
i 21888 11.8%
t 21888 11.8%
m 16982 9.2%
l 4848 2.6%
s 4848 2.6%
e 4848 2.6%
c 3495 1.9%
a 2090 1.1%

experimentid
Categorical

HIGH CORRELATION
MISSING

Experiment Id

Distinct 2
Distinct (%) < 0.1%
Missing 96200
Missing (%) 66.6%
Memory size 1.1 MiB
wenet 26370
wenetUnitn 21888

Length

Max length 10
Median length 5
Mean length 7.267810518
Min length 5

Characters and Unicode

Total characters 350730
Distinct characters 6
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row wenet
2nd row wenet
3rd row wenet
4th row wenet
5th row wenet

Common Values

Value Count Frequency (%)
wenet 26370 18.3%
wenetUnitn 21888 15.2%
(Missing) 96200 66.6%

Length

Histogram of lengths of the category

Category Frequency Plot

Value Count Frequency (%)
wenet 26370 54.6%
wenetunitn 21888 45.4%

Most occurring characters

Value Count Frequency (%)
e 96516 27.5%
n 92034 26.2%
t 70146 20.0%
w 48258 13.8%
U 21888 6.2%
i 21888 6.2%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 328842 93.8%
Uppercase Letter 21888 6.2%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
e 96516 29.4%
n 92034 28.0%
t 70146 21.3%
w 48258 14.7%
i 21888 6.7%
Uppercase Letter
Value Count Frequency (%)
U 21888 100.0%

Most occurring scripts

Value Count Frequency (%)
Latin 350730 100.0%

Most frequent character per script

Latin
Value Count Frequency (%)
e 96516 27.5%
n 92034 26.2%
t 70146 20.0%
w 48258 13.8%
U 21888 6.2%
i 21888 6.2%

Most occurring blocks

Value Count Frequency (%)
ASCII 350730 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
e 96516 27.5%
n 92034 26.2%
t 70146 20.0%
w 48258 13.8%
U 21888 6.2%
i 21888 6.2%

userid
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

User id

Distinct 60
Distinct (%) 0.1%
Missing 96200
Missing (%) 66.6%
Infinite 0
Infinite (%) 0.0%
Mean 30.01889842
Minimum 1
Maximum 132
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 1.1 MiB

Quantile statistics

Minimum 1
5-th percentile 4
Q1 11
median 20
Q3 47
95-th percentile 70
Maximum 132
Range 131
Interquartile range (IQR) 36

Descriptive statistics

Standard deviation 23.3663621
Coefficient of variation (CV) 0.778388393
Kurtosis 1.120736143
Mean 30.01889842
Median Absolute Deviation (MAD) 11
Skewness 1.077134101
Sum 1448652
Variance 545.9868779
Monotonicity Not monotonic
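
Several of the statistics above are tied together by simple identities: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the mean is the sum divided by the number of non-missing values. The sketch below checks these identities against the userid figures in this report. The variable names are illustrative.

```python
from math import isclose

# Figures for userid as reported above.
q1, q3 = 11, 47
minimum, maximum = 1, 132
mean = 30.01889842
std = 23.3663621
total = 1448652
n_present = 144458 - 96200  # observations minus missing values

iqr = q3 - q1
value_range = maximum - minimum
cv = std / mean
variance = std ** 2
mean_from_sum = total / n_present

print(iqr, value_range, cv, variance, mean_from_sum)
```

Each derived value agrees with the reported one up to rounding of the inputs.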
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
20 3333 2.3%
9 2862 2.0%
10 2747 1.9%
17 2392 1.7%
12 2386 1.7%
63 2230 1.5%
27 2058 1.4%
74 1734 1.2%
18 1679 1.2%
1 1538 1.1%
Other values (50) 25299 17.5%
(Missing) 96200 66.6%

Value Count Frequency (%)
1 1538 1.1%
2 416 0.3%
3 57 < 0.1%
4 1079 0.7%
5 899 0.6%
6 194 0.1%
7 617 0.4%
8 902 0.6%
9 2862 2.0%
10 2747 1.9%

Value Count Frequency (%)
132 275 0.2%
124 85 0.1%
75 162 0.1%
74 1734 1.2%
73 98 0.1%
70 962 0.7%
69 131 0.1%
68 1135 0.8%
67 693 0.5%
65 337 0.2%

day
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Day, encoded as month (2 digits) and day (2 digits)

Distinct 44
Distinct (%) 0.1%
Missing 96200
Missing (%) 66.6%
Infinite 0
Infinite (%) 0.0%
Mean 464.1346305
Minimum 315
Maximum 624
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 1.1 MiB

Quantile statistics

Minimum 315
5-th percentile 318
Q1 325
median 404
Q3 611
95-th percentile 619
Maximum 624
Range 309
Interquartile range (IQR) 286

Descriptive statistics

Standard deviation 137.1398884
Coefficient of variation (CV) 0.2954743718
Kurtosis -1.904516334
Mean 464.1346305
Median Absolute Deviation (MAD) 86
Skewness 0.1043596868
Sum 22398209
Variance 18807.34899
Monotonicity Increasing
Histogram with fixed size bins (bins=44)
Value Count Frequency (%)
328 1440 1.0%
324 1415 1.0%
325 1410 1.0%
612 1394 1.0%
327 1392 1.0%
326 1384 1.0%
318 1373 1.0%
329 1362 0.9%
319 1362 0.9%
322 1336 0.9%
Other values (34) 34390 23.8%
(Missing) 96200 66.6%

Value Count Frequency (%)
315 82 0.1%
316 472 0.3%
317 1136 0.8%
318 1373 1.0%
319 1362 0.9%
320 1260 0.9%
321 1323 0.9%
322 1336 0.9%
323 1331 0.9%
324 1415 1.0%

Value Count Frequency (%)
624 1 < 0.1%
622 12 < 0.1%
621 848 0.6%
620 1218 0.8%
619 1257 0.9%
618 1081 0.7%
617 1123 0.8%
616 1179 0.8%
615 1197 0.8%
614 1306 0.9%
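
Because day packs a month and a day of the month into one number, values such as 315 and 624 are calendar codes rather than quantities, and statistics such as the mean or the standard deviation above are probably best read with that in mind. A minimal sketch of decoding the value follows. The function name is hypothetical.

```python
def decode_day(value):
    """Split an integer such as 315 into (month, day) = (3, 15)."""
    month, day = divmod(int(value), 100)
    return month, day


first = decode_day(315)  # minimum reported for day
last = decode_day(624)   # maximum reported for day
```

The decoded minimum and maximum, 15 March and 24 June, match the minimum and maximum dates reported for the timestamp variable.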

accuracy
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

The highest accuracy for possible activities

Distinct 77
Distinct (%) 0.2%
Missing 96200
Missing (%) 66.6%
Infinite 0
Infinite (%) 0.0%
Mean 84.55932695
Minimum 24
Maximum 100
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 1.1 MiB

Quantile statistics

Minimum 24
5-th percentile 40
Q1 70
median 99
Q3 100
95-th percentile 100
Maximum 100
Range 76
Interquartile range (IQR) 30

Descriptive statistics

Standard deviation 23.51700489
Coefficient of variation (CV) 0.2781124891
Kurtosis -0.4329443338
Mean 84.55932695
Median Absolute Deviation (MAD) 1
Skewness -1.165565129
Sum 4080664
Variance 553.0495192
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
100 20381 14.1%
40 6350 4.4%
99 5374 3.7%
96 2464 1.7%
97 2123 1.5%
98 1593 1.1%
92 561 0.4%
50 414 0.3%
41 386 0.3%
85 323 0.2%
Other values (67) 8289 5.7%
(Missing) 96200 66.6%

Value Count Frequency (%)
24 2 < 0.1%
25 3 < 0.1%
26 9 < 0.1%
27 13 < 0.1%
28 14 < 0.1%
29 26 < 0.1%
30 25 < 0.1%
31 51 < 0.1%
32 44 < 0.1%
33 62 < 0.1%

Value Count Frequency (%)
100 20381 14.1%
99 5374 3.7%
98 1593 1.1%
97 2123 1.5%
96 2464 1.7%
95 250 0.2%
94 250 0.2%
93 234 0.2%
92 561 0.4%
91 178 0.1%

label
Categorical

HIGH CORRELATION
MISSING

The activity name with highest accuracy

Distinct 6
Distinct (%) < 0.1%
Missing 96200
Missing (%) 66.6%
Memory size 1.1 MiB
Still 28796
Unknown 8253
Tilting 5723
OnFoot 2671
InVehicle 2608

Length

Max length 9
Median length 5
Mean length 5.867897551
Min length 5

Characters and Unicode

Total characters 283173
Distinct characters 20
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Still
2nd row Still
3rd row Still
4th row Still
5th row Still

Common Values

Value Count Frequency (%)
Still 28796 19.9%
Unknown 8253 5.7%
Tilting 5723 4.0%
OnFoot 2671 1.8%
InVehicle 2608 1.8%
OnBycicle 207 0.1%
(Missing) 96200 66.6%

Length

Histogram of lengths of the category

Category Frequency Plot

Value Count Frequency (%)
still 28796 59.7%
unknown 8253 17.1%
tilting 5723 11.9%
onfoot 2671 5.5%
invehicle 2608 5.4%
onbycicle 207 0.4%

Most occurring characters

Value Count Frequency (%)
l 66130 23.4%
i 43057 15.2%
t 37190 13.1%
n 35968 12.7%
S 28796 10.2%
o 13595 4.8%
U 8253 2.9%
k 8253 2.9%
w 8253 2.9%
g 5723 2.0%
Other values (10) 27955 9.9%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 229429 81.0%
Uppercase Letter 53744 19.0%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
l 66130 28.8%
i 43057 18.8%
t 37190 16.2%
n 35968 15.7%
o 13595 5.9%
k 8253 3.6%
w 8253 3.6%
g 5723 2.5%
e 5423 2.4%
c 3022 1.3%
Other values (2) 2815 1.2%
Uppercase Letter
Value Count Frequency (%)
S 28796 53.6%
U 8253 15.4%
T 5723 10.6%
O 2878 5.4%
F 2671 5.0%
I 2608 4.9%
V 2608 4.9%
B 207 0.4%

Most occurring scripts

Value Count Frequency (%)
Latin 283173 100.0%

Most frequent character per script

Latin
Value Count Frequency (%)
l 66130 23.4%
i 43057 15.2%
t 37190 13.1%
n 35968 12.7%
S 28796 10.2%
o 13595 4.8%
U 8253 2.9%
k 8253 2.9%
w 8253 2.9%
g 5723 2.0%
Other values (10) 27955 9.9%

Most occurring blocks

Value Count Frequency (%)
ASCII 283173 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
l 66130 23.4%
i 43057 15.2%
t 37190 13.1%
n 35968 12.7%
S 28796 10.2%
o 13595 4.8%
U 8253 2.9%
k 8253 2.9%
w 8253 2.9%
g 5723 2.0%
Other values (10) 27955 9.9%

still
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

The value of the "still" activity

Distinct 101
Distinct (%) 0.2%
Missing 96904
Missing (%) 67.1%
Infinite 0
Infinite (%) 0.0%
Mean 67.43853304
Minimum 0
Maximum 100
Zeros 2
Zeros (%) < 0.1%
Negative 0
Negative (%) 0.0%
Memory size 1.1 MiB

Quantile statistics

Minimum 0
5-th percentile 3
Q1 10
median 97
Q3 100
95-th percentile 100
Maximum 100
Range 100
Interquartile range (IQR) 90

Descriptive statistics

Standard deviation 40.14394767
Coefficient of variation (CV) 0.5952672139
Kurtosis -1.425188787
Mean 67.43853304
Median Absolute Deviation (MAD) 3
Skewness -0.6332223744
Sum 3206972
Variance 1611.536535
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
100 15972 11.1%
10 9402 6.5%
99 6025 4.2%
98 1601 1.1%
96 1586 1.1%
97 1518 1.1%
1 1438 1.0%
2 719 0.5%
92 321 0.2%
50 283 0.2%
Other values (91) 8689 6.0%
(Missing) 96904 67.1%

Value Count Frequency (%)
0 2 < 0.1%
1 1438 1.0%
2 719 0.5%
3 266 0.2%
4 281 0.2%
5 131 0.1%
6 142 0.1%
7 78 0.1%
8 170 0.1%
9 47 < 0.1%

Value Count Frequency (%)
100 15972 11.1%
99 6025 4.2%
98 1601 1.1%
97 1518 1.1%
96 1586 1.1%
95 82 0.1%
94 70 < 0.1%
93 81 0.1%
92 321 0.2%
91 66 < 0.1%

on_foot
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

The value of the "on_foot" activity

Distinct 98
Distinct (%) 0.3%
Missing 105707
Missing (%) 73.2%
Infinite 0
Infinite (%) 0.0%
Mean 21.00807721
Minimum 1
Maximum 100
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 1.1 MiB

Quantile statistics

Minimum 1
5-th percentile 1
Q1 8
median 10
Q3 14
95-th percentile 96
Maximum 100
Range 99
Interquartile range (IQR) 6

Descriptive statistics

Standard deviation 29.28963286
Coefficient of variation (CV) 1.394208169
Kurtosis 2.211614416
Mean 21.00807721
Median Absolute Deviation (MAD) 3
Skewness 1.967311211
Sum 814084
Variance 857.8825928
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
10 16652 11.5%
1 4254 2.9%
2 1790 1.2%
96 979 0.7%
4 920 0.6%
8 808 0.6%
3 739 0.5%
100 698 0.5%
97 694 0.5%
11 675 0.5%
Other values (88) 10542 7.3%
(Missing) 105707 73.2%

Value Count Frequency (%)
1 4254 2.9%
2 1790 1.2%
3 739 0.5%
4 920 0.6%
5 609 0.4%
6 654 0.5%
7 530 0.4%
8 808 0.6%
9 504 0.3%
10 16652 11.5%

Value Count Frequency (%)
100 698 0.5%
99 46 < 0.1%
98 344 0.2%
97 694 0.5%
96 979 0.7%
95 213 0.1%
94 202 0.1%
93 204 0.1%
92 300 0.2%
91 168 0.1%

walking
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

The value of the "walking" activity

Distinct 94
Distinct (%) 0.2%
Missing 105989
Missing (%) 73.4%
Infinite 0
Infinite (%) 0.0%
Mean 21.01910629
Minimum 0
Maximum 100
Zeros 75
Zeros (%) 0.1%
Negative 0
Negative (%) 0.0%
Memory size 1.1 MiB

Quantile statistics

Minimum 0
5-th percentile 1
Q1 8
median 10
Q3 14
95-th percentile 96
Maximum 100
Range 100
Interquartile range (IQR) 6

Descriptive statistics

Standard deviation 29.23189464
Coefficient of variation (CV) 1.390729664
Kurtosis 2.229808091
Mean 21.01910629
Median Absolute Deviation (MAD) 3
Skewness 1.971230936
Sum 808584
Variance 854.5036643
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
10 16748 11.6%
1 4285 3.0%
2 1579 1.1%
96 983 0.7%
4 804 0.6%
8 752 0.5%
3 721 0.5%
97 693 0.5%
11 678 0.5%
100 670 0.5%
Other values (84) 10556 7.3%
(Missing) 105989 73.4%

Value Count Frequency (%)
0 75 0.1%
1 4285 3.0%
2 1579 1.1%
3 721 0.5%
4 804 0.6%
5 608 0.4%
6 604 0.4%
7 533 0.4%
8 752 0.5%
9 507 0.4%

Value Count Frequency (%)
100 670 0.5%
99 41 < 0.1%
98 342 0.2%
97 693 0.5%
96 983 0.7%
95 211 0.1%
94 202 0.1%
93 193 0.1%
92 301 0.2%
91 167 0.1%

running
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

The value of the "running" activity

Distinct 72
Distinct (%) 0.2%
Missing 113078
Missing (%) 78.3%
Infinite 0
Infinite (%) 0.0%
Mean 8.444741874
Minimum 0
Maximum 100
Zeros 143
Zeros (%) 0.1%
Negative 0
Negative (%) 0.0%
Memory size 1.1 MiB

Quantile statistics

Minimum 0
5-th percentile 1
Q1 10
median 10
Q3 10
95-th percentile 10
Maximum 100
Range 100
Interquartile range (IQR) 0

Descriptive statistics

Standard deviation 4.979313408
Coefficient of variation (CV) 0.5896347671
Kurtosis 111.5047655
Mean 8.444741874
Median Absolute Deviation (MAD) 0
Skewness 6.884520519
Sum 264996
Variance 24.79356201
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
10 24141 16.7%
1 3678 2.5%
2 1683 1.2%
3 467 0.3%
4 447 0.3%
8 237 0.2%
6 165 0.1%
0 143 0.1%
5 92 0.1%
7 48 < 0.1%
Other values (62) 279 0.2%
(Missing) 113078 78.3%

Value Count Frequency (%)
0 143 0.1%
1 3678 2.5%
2 1683 1.2%
3 467 0.3%
4 447 0.3%
5 92 0.1%
6 165 0.1%
7 48 < 0.1%
8 237 0.2%
9 21 < 0.1%

Value Count Frequency (%)
100 2 < 0.1%
99 6 < 0.1%
98 1 < 0.1%
97 6 < 0.1%
96 2 < 0.1%
95 2 < 0.1%
93 1 < 0.1%
92 3 < 0.1%
90 2 < 0.1%
86 1 < 0.1%

in_vehicle
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

The value of the "in_vehicle" activity

Distinct 93
Distinct (%) 0.2%
Missing 104169
Missing (%) 72.1%
Infinite 0
Infinite (%) 0.0%
Mean 19.20030281
Minimum 1
Maximum 100
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 1.1 MiB

Quantile statistics

Minimum 1
5-th percentile 1
Q1 7
median 10
Q3 17
95-th percentile 96
Maximum 100
Range 99
Interquartile range (IQR) 10

Descriptive statistics

Standard deviation 26.23587386
Coefficient of variation (CV) 1.366430213
Kurtosis 3.575959102
Mean 19.20030281
Median Absolute Deviation (MAD) 5
Skewness 2.215164537
Sum 773561
Variance 688.3210772
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
10 15155 10.5%
1 5706 3.9%
2 1887 1.3%
8 1182 0.8%
96 1057 0.7%
23 838 0.6%
97 748 0.5%
4 725 0.5%
15 696 0.5%
3 668 0.5%
Other values (83) 11627 8.0%
(Missing) 104169 72.1%

Value Count Frequency (%)
1 5706 3.9%
2 1887 1.3%
3 668 0.5%
4 725 0.5%
5 428 0.3%
6 555 0.4%
7 352 0.2%
8 1182 0.8%
9 362 0.3%
10 15155 10.5%

Value Count Frequency (%)
100 84 0.1%
99 86 0.1%
98 190 0.1%
97 748 0.5%
96 1057 0.7%
95 144 0.1%
94 195 0.1%
93 158 0.1%
92 198 0.1%
91 109 0.1%

on_bicycle
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

The value of the "on_bicycle" activity

Distinct 90
Distinct (%) 0.2%
Missing 107690
Missing (%) 74.5%
Infinite 0
Infinite (%) 0.0%
Mean 8.700364447
Minimum 0
Maximum 100
Zeros 68
Zeros (%) < 0.1%
Negative 0
Negative (%) 0.0%
Memory size 1.1 MiB

Quantile statistics

Minimum 0
5-th percentile 1
Q1 3
median 10
Q3 10
95-th percentile 15
Maximum 100
Range 100
Interquartile range (IQR) 7

Descriptive statistics

Standard deviation 10.37516392
Coefficient of variation (CV) 1.192497622
Kurtosis 47.51706577
Mean 8.700364447
Median Absolute Deviation (MAD) 0
Skewness 6.244231838
Sum 319895
Variance 107.6440263
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
10 18745 13.0%
1 4382 3.0%
2 3496 2.4%
3 1922 1.3%
4 1648 1.1%
6 988 0.7%
8 986 0.7%
5 940 0.7%
7 474 0.3%
12 334 0.2%
Other values (80) 2853 2.0%
(Missing) 107690 74.5%

Value Count Frequency (%)
0 68 < 0.1%
1 4382 3.0%
2 3496 2.4%
3 1922 1.3%
4 1648 1.1%
5 940 0.7%
6 988 0.7%
7 474 0.3%
8 986 0.7%
9 280 0.2%

Value Count Frequency (%)
100 53 < 0.1%
99 53 < 0.1%
98 26 < 0.1%
97 75 0.1%
96 28 < 0.1%
95 8 < 0.1%
94 6 < 0.1%
93 8 < 0.1%
92 8 < 0.1%
91 9 < 0.1%

tilting
Categorical

CONSTANT
HIGH CORRELATION
MISSING
REJECTED

The value of the "tilting" activity

Distinct 1
Distinct (%) < 0.1%
Missing 120919
Missing (%) 83.7%
Memory size 1.1 MiB
100.0 23539

Length

Max length 5
Median length 5
Mean length 5
Min length 5

Characters and Unicode

Total characters 117695
Distinct characters 3
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 100.0
2nd row 100.0
3rd row 100.0
4th row 100.0
5th row 100.0

Common Values

Value Count Frequency (%)
100.0 23539 16.3%
(Missing) 120919 83.7%

Length

Histogram of lengths of the category

Category Frequency Plot

Value Count Frequency (%)
100.0 23539 100.0%

Most occurring characters

Value Count Frequency (%)
0 70617 60.0%
1 23539 20.0%
. 23539 20.0%

Most occurring categories

Value Count Frequency (%)
Decimal Number 94156 80.0%
Other Punctuation 23539 20.0%

Most frequent character per category

Decimal Number
Value Count Frequency (%)
0 70617 75.0%
1 23539 25.0%
Other Punctuation
Value Count Frequency (%)
. 23539 100.0%

Most occurring scripts

Value Count Frequency (%)
Common 117695 100.0%

Most frequent character per script

Common
Value Count Frequency (%)
0 70617 60.0%
1 23539 20.0%
. 23539 20.0%

Most occurring blocks

Value Count Frequency (%)
ASCII 117695 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
0 70617 60.0%
1 23539 20.0%
. 23539 20.0%

unknown
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

The value of the "unknown" activity

Distinct 87
Distinct (%) 0.2%
Missing 101803
Missing (%) 70.5%
Infinite 0
Infinite (%) 0.0%
Mean 16.16113
Minimum 0
Maximum 100
Zeros 209
Zeros (%) 0.1%
Negative 0
Negative (%) 0.0%
Memory size 1.1 MiB

Quantile statistics

Minimum 0
5-th percentile 1
Q1 1
median 2
Q3 40
95-th percentile 44
Maximum 100
Range 100
Interquartile range (IQR) 39

Descriptive statistics

Standard deviation 20.41814927
Coefficient of variation (CV) 1.263410991
Kurtosis -0.2534925914
Mean 16.16113
Median Absolute Deviation (MAD) 1
Skewness 0.950896691
Sum 689353
Variance 416.9008195
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
1 15165 10.5%
40 11364 7.9%
2 9436 6.5%
3 1957 1.4%
8 480 0.3%
41 379 0.3%
50 248 0.2%
15 227 0.2%
0 209 0.1%
31 192 0.1%
Other values (77) 2998 2.1%
(Missing) 101803 70.5%

Value Count Frequency (%)
0 209 0.1%
1 15165 10.5%
2 9436 6.5%
3 1957 1.4%
4 68 < 0.1%
6 16 < 0.1%
8 480 0.3%
9 2 < 0.1%
10 12 < 0.1%
11 2 < 0.1%

Value Count Frequency (%)
100 17 < 0.1%
98 11 < 0.1%
96 7 < 0.1%
94 12 < 0.1%
93 4 < 0.1%
92 19 < 0.1%
91 9 < 0.1%
90 6 < 0.1%
89 16 < 0.1%
88 4 < 0.1%

Interactions

(Interaction plots not reproduced.)
2022-07-04T19:05:49.403251 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:05:51.940923 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:05:54.230311 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:05:56.558009 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:05:59.027949 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:06:01.364913 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:06:03.693360 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:06:06.435665 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:05:44.734615 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:05:47.244957 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:05:49.661979 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:05:52.164827 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:05:54.462973 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:05:56.784731 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:05:59.251800 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:06:01.594977 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:06:03.925154 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:06:06.661960 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:05:44.963531 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:05:47.481274 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:05:49.896951 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:05:52.388419 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:05:54.695427 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:05:57.010718 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:05:59.476760 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:06:01.829332 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:06:04.151373 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
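
The calculation above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the code the report generator uses; the helper names `average_ranks`, `pearson` and `spearman_rho` are hypothetical, and ties are assumed to receive the average of the ranks they span.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# A nonlinear but strictly increasing relationship (illustrative values).
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
rho = spearman_rho(x, y)
r = pearson(x, y)
```

Here `rho` reaches the maximum because y increases whenever x does, while `r` stays below it because the relationship is not linear.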

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for an exactly linear relationship the steepness of the line does not affect r; only the sign of the slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
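
As a rough sketch of this formula and of the invariance property, the following standard-library Python computes r directly and then again after shifting and rescaling both variables. The function name `pearson_r` and the sample values are illustrative assumptions, not taken from the dataset.

```python
from statistics import mean


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations.

    The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    so plain sums of deviations are used.
    """
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


# Illustrative values only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

r_original = pearson_r(x, y)

# Separate changes in location and (positive) scale leave r unchanged.
x_shifted = [10 * a + 3 for a in x]
y_shifted = [0.5 * b - 7 for b in y]
r_rescaled = pearson_r(x_shifted, y_shifted)

# Exactly linear data give r = +1 or r = -1 regardless of how steep the line is.
r_steep = pearson_r(x, [100 * a for a in x])
r_shallow = pearson_r(x, [0.01 * a for a in x])
r_negative = pearson_r(x, [-2 * a + 1 for a in x])
```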

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the difference between the number of concordant pairs and the number of discordant pairs, divided by the total number of pairs.
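
The pair-counting definition can be illustrated with a short standard-library sketch. It implements the formula exactly as stated (often called tau-a), in which tied pairs count as neither concordant nor discordant; statistical libraries commonly apply a tie-adjusted variant, so results may differ on data with ties. The function name `kendall_tau_a` is hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, ignoring tie corrections."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move the same way
        elif direction < 0:
            discordant += 1   # the variables move in opposite ways
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Four observations give six pairs; swapping the two middle values of y
# makes one pair discordant, so tau = (5 - 1) / 6.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
```

The double loop over pairs is quadratic in the number of observations, which is fine for an illustration but slow for a table with more than a hundred thousand rows.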

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
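
A minimal sketch of a bias-corrected Cramér's V follows, using the correction as it is commonly stated: the squared φ statistic is reduced by (r-1)(k-1)/(n-1) and the table dimensions are shrunk before normalising. This is an assumption about the general form of the estimator, not a copy of the report generator's code; the function name `cramers_v_corrected` and the toy labels are hypothetical.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long sequences of nominal labels."""
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals = Counter(a)
    col_totals = Counter(b)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for row_label, row_total in row_totals.items():
        for col_label, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((row_label, col_label), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: subtract the expected value of phi2 under independence
    # and shrink the effective numbers of rows and columns.
    phi2_corr = max(0.0, phi2 - (r - 1) * (k - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Toy labels: the first pair is perfectly associated, the second is independent.
v_perfect = cramers_v_corrected(["x", "y"] * 50, ["p", "q"] * 50)
v_independent = cramers_v_corrected(["x", "x", "y", "y"] * 25, ["p", "q", "p", "q"] * 25)
```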

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
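
The nullity correlation shown in the heatmap can be approximated as the Pearson correlation between presence indicators of two columns (1 where a value is present, 0 where it is missing). The sketch below assumes missing values are represented as `None`; the function name `nullity_correlation` and the sample readings are hypothetical, and only the column names come from the variable list of this dataset.

```python
from statistics import mean


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns.

    Returns None when either column is fully present or fully missing,
    because a constant indicator has zero variance and the correlation
    is undefined.
    """
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    ma, mb = mean(a), mean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    sa = sum((p - ma) ** 2 for p in a) ** 0.5
    sb = sum((q - mb) ** 2 for q in b) ** 0.5
    if sa == 0 or sb == 0:
        return None
    return cov / (sa * sb)


# Hypothetical per-minute readings; None marks a missing cell.
walking = [90, None, 85, None]
on_foot = [95, None, 88, None]
running = [None, 10, None, 20]
still = [100, 100, 100, 100]

together = nullity_correlation(walking, on_foot)   # present in the same rows
opposite = nullity_correlation(walking, running)   # present in complementary rows
undefined = nullity_correlation(walking, still)    # 'still' is never missing here
```

A value near +1 means the two variables tend to be present or missing together, and a value near -1 means one tends to be present exactly when the other is missing.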